LM Studies on Filled Pauses in Spontaneous Medical Dictation

Author

  • Jochen Peters
Abstract

We investigate the optimal language model (LM) treatment of abundant filled pauses (FP) in spontaneous monologues from a professional dictation task. The questions addressed here are (1) how to deal with FP in the LM history and (2) to what extent the LM can distinguish between positions with high and low FP likelihood. Our results differ partly from observations reported on dialogues. Discarding FP from all LM histories clearly improves performance. Local perplexities, entropies, and word rankings at positions following FP suggest that most FP indicate hesitations rather than restarts. Proper prediction of FP makes it possible to distinguish FP positions from word positions by a doubled FP probability. Recognition experiments confirm the improvements found in our perplexity studies.
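
To make question (1) concrete, here is a minimal sketch of what "discarding FP from the LM history" can mean for a bigram model. It is not the paper's implementation: the token name `<fp>`, the helper names, the add-one smoothing, and the toy sentences are all assumptions chosen for illustration, and the toy data says nothing about the results reported in the paper. FP is still predicted as an ordinary event, but it is optionally skipped when forming the history for the next word.

```python
import math
from collections import Counter

FP = "<fp>"  # hypothetical token name for a filled pause


def bigram_events(tokens, skip_fp_in_history):
    """Yield (history, word) pairs; optionally keep FP out of the history."""
    history = "<s>"
    for word in tokens:
        yield history, word
        # FP is always predicted, but only becomes history if not skipped.
        if not (skip_fp_in_history and word == FP):
            history = word


def train(sentences, skip_fp_in_history):
    """Collect bigram counts, history counts, and the vocabulary."""
    pair_counts, hist_counts, vocab = Counter(), Counter(), set()
    for sent in sentences:
        for h, w in bigram_events(sent, skip_fp_in_history):
            pair_counts[(h, w)] += 1
            hist_counts[h] += 1
            vocab.add(w)
    return pair_counts, hist_counts, vocab


def perplexity(model, sentences, skip_fp_in_history):
    """Perplexity with add-one smoothing (chosen only for simplicity).

    Assumes every test word also occurs in the training vocabulary.
    """
    pair_counts, hist_counts, vocab = model
    log_sum, n = 0.0, 0
    for sent in sentences:
        for h, w in bigram_events(sent, skip_fp_in_history):
            p = (pair_counts[(h, w)] + 1) / (hist_counts[h] + len(vocab))
            log_sum += math.log(p)
            n += 1
    return math.exp(-log_sum / n)


# Made-up toy data, for illustration only.
toy_train = [
    ["the", "patient", FP, "has", "fever"],
    ["the", FP, "patient", "has", "fever"],
    ["the", "patient", "has", FP, "fever"],
]
toy_test = [["the", "patient", "has", "fever"]]

for skip in (False, True):
    model = train(toy_train, skip)
    print("skip FP in history:", skip, "perplexity:", perplexity(model, toy_test, skip))
```

With `skip_fp_in_history=True`, the word after a filled pause is conditioned on the last real word, so counts for a pair such as `("the", "patient")` are pooled across fluent and disfluent contexts rather than being split off into a separate `<fp>` history.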

Similar resources

Modeling Filled Pauses in Medical Dictations

Filled pauses are characteristic of spontaneous speech and can present considerable problems for speech recognition because they are often recognized as short words. An "um" can be recognized as "thumb" or "arm" if the recognizer's language model does not adequately represent FPs. Recognition of quasi-spontaneous speech (medical dictation) is subject to this problem as well. Results from medical dictations ...

Filled Pause Distribution and Modeling in Quasi-Spontaneous Speech

Filled pauses (FP) are characteristic of spontaneous speech and present considerable problems for speech recognition because they are often recognized as short words. Recognition of quasi-spontaneous speech (medical dictation) is subject to this problem as well. An "um" can be recognized as "thumb" or "arm" if the recognizer's language model does not adequately represent FPs. Representing FPs in the trainin...

Synthesising Filled Pauses: Representation and Datamixing

Filled pauses occur frequently in spontaneous human speech, yet modern text-to-speech synthesis systems rarely model these disfluencies overtly, and consequently they do not output convincing synthetic filled pauses. This paper presents a text-to-speech system that is specifically designed to model these particular disfluencies more effectively. A preparatory investigation shows that a synthet...

Modeling spontaneous speech variability in professional dictation

In this work, we present a word-level model combination approach that aims to improve the modeling of spontaneous speech variabilities on a highly spontaneous, real-life medical transcription task. The technique (1) separates speech variabilities into pre-defined classes, (2) generates acoustic and pronunciation models specific to each speech variability, and (3) properly combines these models la...

Pauses and hesitations in French spontaneous speech

In traditional terminology, silent and filled pauses are grouped together, whereas hesitation lengthening is put into a separate category. However, while these various phenomena are very often associated, there have been few studies on how they interact. We analyzed an hour of spontaneous speech to show that silent and filled pauses operate in a totally different way, and that contrary to commo...

Publication date: 2003